Semantics-based pretranslation for SMT using fuzzy matches
Authors
Abstract
Semantic knowledge has recently been adopted in SMT preprocessing, decoding and evaluation so that sentences can be compared by meaning rather than by mere lexical and syntactic similarity. Little attention, however, has been paid to semantic knowledge when integrating fuzzy matches from a translation memory with SMT. We present work in progress on semantics-based pretranslation before decoding in SMT. This involves applying fuzzy matching metrics based on lexical semantics and semantic roles, aligning parse trees based on semantic roles, and pretranslating matching source sentence parts using aligned tree nodes.
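To make the idea of a fuzzy matching metric informed by lexical semantics concrete, here is a minimal sketch. It is not the metric used in the paper, which the abstract does not specify. It assumes a plain word-level edit distance, and the `synonyms` table is a hypothetical toy stand-in for a real lexical-semantic resource. The function name `fuzzy_match_score` is likewise an illustration only.

```python
from typing import Dict, List, Optional


def fuzzy_match_score(
    source: List[str],
    tm_source: List[str],
    synonyms: Optional[Dict[str, str]] = None,
) -> float:
    """Word-level edit-distance similarity in [0, 1].

    `synonyms` (hypothetical) maps a word to a canonical form, so that
    words sharing a canonical form count as a match.
    """
    synonyms = synonyms or {}

    def canon(tok: str) -> str:
        tok = tok.lower()
        return synonyms.get(tok, tok)

    a = [canon(t) for t in source]
    b = [canon(t) for t in tm_source]
    if not a and not b:
        return 1.0  # two empty sentences are treated as identical

    # Standard Levenshtein dynamic programme over tokens, one row at a time.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr

    # Normalise the distance by the longer sentence length.
    return 1.0 - prev[-1] / max(len(a), len(b))


if __name__ == "__main__":
    sentence = "the car stopped suddenly".split()
    tm_entry = "the automobile stopped abruptly".split()
    toy_synonyms = {"automobile": "car", "abruptly": "suddenly"}
    print(fuzzy_match_score(sentence, tm_entry))
    print(fuzzy_match_score(sentence, tm_entry, toy_synonyms))
```

With the toy table, the two example sentences match word for word, whereas the purely lexical score treats the synonym pairs as mismatches. A metric based on semantic roles would additionally need parsed and role-labelled input, which this sketch does not attempt.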
Similar resources
Statistical Machine Translation based Passage Retrieval for Cross-Lingual Question Answering --- Experiments at NTCIR-6
In this paper, we propose a novel approach for Cross-Lingual Question Answering (CLQA), where the statistical machine translation (SMT) is utilized. In the proposed method, the SMT is deeply incorporated into the question answering process, instead of using it as the pre-processing of the mono-lingual QA process as in the previous work. The proposed method can be considered as exploiting the SM...
Dynamic Translation Memory: Using Statistical Machine Translation to Improve Translation Memory Fuzzy Matches
Professional translators of technical documents often use Translation Memory (TM) systems in order to capitalize on the repetitions frequently observed in these documents. TM systems typically exploit not only complete matches between the source sentence to be translated and some previously translated sentence, but also so-called fuzzy matches, where the source sentence has some substantial com...
Convergence of Translation Memory and Statistical Machine Translation
We present two methods that merge ideas from statistical machine translation (SMT) and translation memories (TM). We use a TM to retrieve matches for source segments, and replace the mismatched parts with instructions to an SMT system to fill in the gap. We show that for fuzzy matches of over 70%, one method outperforms both SMT and TM baselines.
A duality between LM-fuzzy possibility computations and their logical semantics
Let X be a dcpo and let L be a complete lattice. The family σL(X) of all Scott continuous mappings from X to L is a complete lattice under pointwise order; we call it the L-fuzzy Scott structure on X. Let E be a dcpo. A mapping g : σL(E) −> M is called an LM-fuzzy possibility valuation of E if it preserves arbitrary unions. Denote by πLM(E) the set of all LM-fuzzy possibility valuations of E. T...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...